Thanks to Joel Dreessen from MDE for helping me make sense of this monitoring data!
library(tidyverse)
library(lubridate)
library(here)
If you have an R Project and your data in the same folder as this script, you should be in good shape, but it’s always good to check! This is the only time we need to use the here package.
here()
## [1] "P:/Webinars/2019 Webinars/2019--12Dec3-19--R Training, Jenny St. Clair/Materials/RTraining_JS_IN_PROGRESS/RTraining/Webinar1"
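One way to put here() to work is to build the path to your data file from the project root. The sketch below is an assumption, not the original workflow: the file name ozone_data.csv is hypothetical, and read.csv() is only a guess at how the data were loaded.

```r
library(here)

# Hypothetical file name: swap in the CSV that sits next to this script.
csv_path <- here("ozone_data.csv")

# Only read the file if it exists, so the sketch runs without error either way.
if (file.exists(csv_path)) {
  data <- read.csv(csv_path)  # assumption: the data were read with read.csv()
} else {
  message("No file found at: ", csv_path)
}
```

Because here() starts from the project root, the same line should work on another computer without editing a hard-coded path.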
data <- data %>% select(-c(X, time))
data$Site.AQS.split <- as.character(data$Site.AQS)
data <- data %>% separate(Site.AQS.split, c("State.FIPS", "County.AQS"), 2)
data <- data %>% separate(County.AQS, c("County.FIPS", "Site"), 3)
data$Site.AQS <- as.factor(data$Site.AQS)
data$State.FIPS <- as.factor(data$State.FIPS)
data$County.FIPS <- as.factor(data$County.FIPS)
data$Method <- as.factor(data$Method)
data <- data %>% filter(State.FIPS == "24")
data$date <- data$date %>%
  as_datetime()
str(data)
## 'data.frame': 724 obs. of 8 variables:
## $ o3 : int 30 31 33 33 32 29 29 29 33 36 ...
## $ Agency : Factor w/ 5 levels "MD1","NJ1","NY1",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Site.AQS : Factor w/ 68 levels "240031003","240051007",..: 13 13 13 13 13 13 13 13 13 13 ...
## $ Method : Factor w/ 2 levels "47","87": 1 1 1 1 1 1 1 1 1 1 ...
## $ date : POSIXct, format: "2019-10-28 00:00:00" "2019-10-28 01:00:00" ...
## $ State.FIPS : Factor w/ 3 levels "24","34","36": 1 1 1 1 1 1 1 1 1 1 ...
## $ County.FIPS: Factor w/ 37 levels "001","003","005",..: 13 13 13 13 13 13 13 13 13 13 ...
## $ Site : chr "9001" "9001" "9001" "9001" ...
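To see what the two separate() calls do, here is a small sketch on a made-up table holding one Site.AQS value taken from the factor levels above. A numeric separator splits at that character position. Passing two positions in one call is an alternative to the two-step version, not the code used above, and the names toy and toy_split are hypothetical.

```r
library(tidyr)

# Toy table with one AQS site ID (hypothetical object name).
toy <- data.frame(Site.AQS = "240031003", stringsAsFactors = FALSE)

# Split after character 2 (state) and after character 5 (county).
toy_split <- separate(
  toy, Site.AQS,
  into = c("State.FIPS", "County.FIPS", "Site"),
  sep = c(2, 5),
  remove = FALSE  # keep the original ID column alongside the pieces
)

toy_split
```

The pieces come back as character strings, so the leading zeros in the county code are preserved.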
The two main functions necessary for a ggplot graphic are ggplot() and a geometry function such as geom_point(), connected by a “+”. Data is fed into ggplot(). There must be an aes() function (short for aesthetic) inside one of these, and this is where you specify which variables you’d like to examine. Just like in Cartesian coordinates, aes() will look for x and then y. The geom_…() function specifies the geometry we’d like to use. Other commonly used geometries are geom_col() and geom_line(). The plot below tells us the extent of our data, which is helpful, but we need to know more. This dataset also includes the site where each ozone value was recorded, as well as the method used to record it.
TLDR: ggplot requires data fed into ggplot(), a geometry specified by geom_…(), and x/y variables specified inside aes(), which goes inside either ggplot() or geom_…().
data %>%
  ggplot(aes(x = date, y = o3)) +
  geom_point()
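As the TLDR says, aes() can go inside either ggplot() or the geom. A minimal sketch of both placements, using a made-up data frame called toy (hypothetical) with the first few o3 values from the str() output above:

```r
library(ggplot2)

# Made-up hourly table for illustration only (hypothetical object name).
toy <- data.frame(
  date = as.POSIXct("2019-10-28 00:00:00", tz = "UTC") + 3600 * 0:5,
  o3   = c(30, 31, 33, 33, 32, 29)
)

# aes() inside ggplot(): every layer added later inherits this mapping.
p_global <- ggplot(toy, aes(x = date, y = o3)) +
  geom_point()

# aes() inside geom_point(): the mapping applies to this layer only.
p_layer <- ggplot(toy) +
  geom_point(aes(x = date, y = o3))
```

With a single layer the two plots draw the same points. The placement starts to matter once you add a second geometry that should, or should not, share the mapping.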
If we want to see which point came from which site, we can do so by using color. This means mapping another variable to an aesthetic, so it goes inside an aes() function.
Side note: the last line of the plot below adds a theme. There are many different themes you can use, and I HIGHLY recommend using one anytime you share a plot. It is an easy way to make your work look sharp. I recommend theme_minimal(), theme_light(), theme_bw(), or theme_void(). Be sure to take a few minutes to try these out and pick your favorite! Looks matter.
data %>%
  ggplot(aes(x = date, y = o3)) +
  geom_point(aes(color = Site.AQS)) +
  theme_bw()
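A quick way to compare the recommended themes is to store the plot once and add each theme to the stored object. This is only a sketch on made-up data; the names toy, base_plot and themed are hypothetical.

```r
library(ggplot2)

# Made-up values for illustration only.
toy <- data.frame(hour = 0:5, o3 = c(30, 31, 33, 33, 32, 29))

# Build the plot once, without a theme.
base_plot <- ggplot(toy, aes(x = hour, y = o3)) +
  geom_point()

# Add a different theme to the same stored plot each time.
themed <- list(
  minimal = base_plot + theme_minimal(),
  light   = base_plot + theme_light(),
  bw      = base_plot + theme_bw(),
  void    = base_plot + theme_void()
)
```

Type themed$minimal, themed$light, and so on at the console to view each version in turn.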
data %>%
  ggplot(aes(x = date, y = o3)) +
  geom_line(aes(color = Site.AQS)) +
  theme_bw() +
  facet_wrap(~Site.AQS)
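facet_wrap(~Site.AQS) splits the plot into one panel per site. A minimal sketch on made-up values for two sites (the site IDs are borrowed from the factor levels above; the o3 values and object names are hypothetical):

```r
library(ggplot2)

# Made-up values for two sites, four hours each.
toy <- data.frame(
  hour     = rep(0:3, times = 2),
  o3       = c(30, 31, 33, 33, 25, 27, 28, 26),
  Site.AQS = rep(c("240031003", "240051007"), each = 4)
)

p_facets <- ggplot(toy, aes(x = hour, y = o3)) +
  geom_line(aes(color = Site.AQS)) +
  theme_bw() +
  facet_wrap(~Site.AQS)

# Count the panels the facet produced: one per distinct site.
n_panels <- length(unique(layer_data(p_facets)$PANEL))
```

By default every panel shares the same x and y axes, which makes it easy to compare levels between sites at a glance.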
data %>%
  ggplot(aes(x = date, y = o3)) +
  geom_line(aes(color = Site.AQS)) +
  theme_bw() +
  facet_wrap(~Site.AQS) +
  # Rotate the x-axis labels and drop the legend ("none" is the documented value)
  theme(axis.text.x = element_text(angle = 90), legend.position = "none")
data %>%
  ggplot(aes(x = date, y = o3)) +
  # Thicker lines; linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(aes(color = Site.AQS), linewidth = 1.2) +
  theme_bw() +
  facet_wrap(~Site.AQS) +
  theme(axis.text.x = element_text(angle = 90), legend.position = "none")
data %>%
  ggplot(aes(x = date, y = o3)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(aes(color = Site.AQS), linewidth = 1.2) +
  theme_bw() +
  facet_wrap(~Site.AQS) +
  theme(axis.text.x = element_text(angle = 90), legend.position = "none") +
  labs(title = "Hourly Ozone at Maryland Sites over 36 Hours", x = "Date/Time", y = "Ozone (ppb)")
Hint: you can add the subtitle inside the labs() function.
Hint: Look back to where we got rid of the legend.
data %>%
  ggplot(aes(x = date, y = o3)) +
  # Color by method this time; linewidth replaces size in ggplot2 3.4.0 and later
  geom_line(aes(color = Method), linewidth = 1.2) +
  theme_bw() +
  facet_wrap(~Site.AQS) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(
    title = "Hourly Ozone at Maryland Sites over 36 Hours",
    subtitle = "Method 47 = Ultraviolet Photometry, Method 87 = Ultraviolet Radiation Absorption"
  )
Remember, when you run into errors and weird results, don’t spend too much time scratching your head. If you can’t think of a solution after a couple of minutes, google it! Stack Overflow and the other Stack Exchange sites are your friends too. If you are really stuck, shoot me an email.
https://ggplot2.tidyverse.org/reference/geom_histogram.html
plot <- data %>%
  ggplot(aes(o3)) +
  geom_histogram(aes(fill = Site), color = "black") +
  theme_bw()
library(plotly)
ggplotly(plot)
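When no bin settings are given, geom_histogram() falls back to 30 bins and prints a message suggesting you pick a binwidth. A minimal sketch of setting the width yourself, on made-up values taken from the str() output above; the 5 ppb width is an assumption, not a recommendation from the original analysis, and the object names are hypothetical.

```r
library(ggplot2)

# First ten o3 values shown in the str() output, used as toy data.
toy <- data.frame(o3 = c(30, 31, 33, 33, 32, 29, 29, 29, 33, 36))

# binwidth sets the width of each bar in data units (ppb here);
# boundary = 0 lines the bin edges up on multiples of the width.
p_hist <- ggplot(toy, aes(o3)) +
  geom_histogram(binwidth = 5, boundary = 0, color = "black", fill = "grey70") +
  theme_bw()
```

Try a few widths: bins that are too wide hide the shape of the distribution, while bins that are too narrow make it look noisy.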